[Entity Analytics] Update `host.ip` aggregation to remove painless script #252426
ymao1 merged 2 commits into elastic:main from

Conversation
Pinging @elastic/security-entity-analytics (Team:Entity Analytics)
💚 Build Succeeded
Metrics [docs]
History
cc @ymao1
Thanks @ymao1. I tested this locally with two indices (`hosts-ip` with `host.ip` mapped as `ip`, and `hosts-keyword` with `host.ip` mapped as `keyword`) and verified the following: the `value_type: 'ip'` approach causes a shard failure on keyword-mapped indices. This means IPs that only exist in keyword-mapped indices are silently dropped from the host/user details flyout without any warning or error shown to the user. The screenshot below shows 4 IP addresses in the hosts table but only 2 in the flyout, which are aggregated using the
I understand the performance benefit of removing the painless script. However, could you clarify:
CC: @jaredburgettelastic
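The silent drop described above could in principle be surfaced by inspecting the `_shards` section that Elasticsearch returns alongside partial aggregation results. Below is a minimal TypeScript sketch; the `hasIpMappingShardFailure` helper and the trimmed response types are illustrative assumptions, not part of this PR:

```typescript
// Hypothetical helper (not part of this PR): inspect the `_shards` section of
// an Elasticsearch search response for the partial shard failures that
// `value_type: 'ip'` produces on keyword-mapped indices, so a consumer could
// warn the user instead of silently dropping IPs.
interface ShardFailure {
  index?: string;
  reason?: { type?: string; reason?: string };
}

interface SearchResponseShards {
  total: number;
  successful: number;
  failed: number;
  failures?: ShardFailure[];
}

function hasIpMappingShardFailure(shards: SearchResponseShards): boolean {
  if (shards.failed === 0) return false;
  // The mixed-mapping case surfaces as an illegal_argument_exception
  // ("Failed trying to format bytes as IP address ...").
  return (shards.failures ?? []).some(
    (f) => f.reason?.type === 'illegal_argument_exception'
  );
}
```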
@abhishekbhatia1710 Thanks for desk testing! The original painless script was introduced to handle IP data that was incorrectly
Thanks for the context, that makes sense. Given this is an ingest-side edge case, I think we can consider this a tradeoff. Approving!
abhishekbhatia1710 left a comment:

LGTM, code and desk tested!
* commit '7dcc1fe3c205d2de0c3ca3f65804f21de09013c3': (285 commits)
  - Enrich kbn-check-saved-objects-cli README with CI and manual usage docs (elastic#252557)
  - [Discover] Add feature flag to make ESQL the default query mode (elastic#252268)
  - Add maskProps.headerZindexLocation above to inspect component flyout (elastic#252543)
  - [Security Solution][Atack/Alerts] Flyout header: Assignees (elastic#252190)
  - Upgrade EUI to v112.3.0 (elastic#252315)
  - [Fleet] Make save_knowledge_base async in streaming state machine (elastic#252328)
  - Upgrade @smithy/config-resolver 4.3.0 → 4.4.6 (elastic#252457)
  - [Lens as API] Add colorMapping support for XY charts (ES|QL data layers) (elastic#252051)
  - [WorkplaceAI] Add Google Drive data source and connector (elastic#250677)
  - [Scout] Move GlobalSearch FTR tests to Scout (elastic#252201)
  - [EDR Workflows] Fix osquery pack results display when agent clock is skewed (elastic#251417)
  - [Observability Onboarding] Apply integrations limit after dedup in parseIntegrationsTSV (elastic#252486)
  - [Entity Analytics] Update `host.ip` aggregation to remove painless script (elastic#252426)
  - Address `@elastic/eui/require-table-caption` lint violations across `@elastic/obs-presentation-team` files (elastic#251050)
  - Consolidate JSON stringify dependencies (elastic#251890)
  - [index mgmt] Use esql instead of query dsl to get the index count (elastic#252422)
  - Add Usage API Plugin (elastic#252434)
  - Cases All Templates page (elastic#250372)
  - [Agent Builder] Default value for optional params in ESQL tools (elastic#238472)
  - [Fleet] Add upgrade_details.metadata.reason to AgentResponseSchema (elastic#252485)
  - ...
Starting backport for target branches: 9.3 https://github.com/elastic/kibana/actions/runs/22315605462
Starting backport for target branches: 9.3 https://github.com/elastic/kibana/actions/runs/22315605302
[Entity Analytics] Update `host.ip` aggregation to remove painless script (elastic#252426)

## Summary

The painless script was originally introduced into the `host.ip` aggregation because the normal aggregation would fail when aggregating over data where the `host.ip` field had a mixed mapping (mapped as `keyword` in one index and `ip` in another). With the introduction of the `value_type` specification in Elasticsearch, we can now choose which value type to use when there are conflicts. This PR replaces the inefficient painless script in the `host.ip` aggregation with a standard terms agg that uses a `value_type` specification.

## To Verify

**Verify that the host and user flyouts show aggregated IP information**

1. Start ES and Kibana and load some entity data that includes `host.ip` info.
2. Open the host and user flyouts from Explore and verify that IP information is populated in the observed details.

**To recreate the original problem:**

1. Start ES and Kibana and go to the Dev Console.
2. Create two indices with `host.ip` and `timestamp` fields, then index some documents. Notice one index has `host.ip` mapped as `keyword` and the other has `host.ip` mapped as `ip`.

<details>
<summary>Dev Console Commands</summary>

```
PUT hosts-keyword
{
  "mappings": {
    "properties": {
      "host.ip": { "type": "keyword" },
      "timestamp": { "type": "date" }
    }
  }
}

POST hosts-keyword/_bulk
{"index":{}}
{"host.ip":"192.168.1.1","timestamp":"2025-02-09T10:00:00Z"}
{"index":{}}
{"host.ip":"10.0.0.5","timestamp":"2025-02-09T10:01:00Z"}
{"index":{}}
{"host.ip":"172.16.0.100","timestamp":"2025-02-09T10:02:00Z"}
{"index":{}}
{"host.ip":"192.168.2.50","timestamp":"2025-02-09T10:03:00Z"}
{"index":{}}
{"host.ip":"10.0.1.20","timestamp":"2025-02-09T10:04:00Z"}
{"index":{}}
{"host.ip":"203.0.113.42","timestamp":"2025-02-09T10:05:00Z"}
{"index":{}}
{"host.ip":"198.51.100.10","timestamp":"2025-02-09T10:06:00Z"}
{"index":{}}
{"host.ip":"192.168.0.1","timestamp":"2025-02-09T10:07:00Z"}
{"index":{}}
{"host.ip":"10.10.10.10","timestamp":"2025-02-09T10:08:00Z"}
{"index":{}}
{"host.ip":"172.31.255.1","timestamp":"2025-02-09T10:09:00Z"}

PUT hosts-ip
{
  "mappings": {
    "properties": {
      "host.ip": { "type": "ip" },
      "timestamp": { "type": "date" }
    }
  }
}

POST hosts-ip/_bulk
{"index":{}}
{"host.ip":"192.168.1.1","timestamp":"2025-02-09T10:00:00Z"}
{"index":{}}
{"host.ip":"10.0.0.5","timestamp":"2025-02-09T10:01:00Z"}
{"index":{}}
{"host.ip":"172.16.0.100","timestamp":"2025-02-09T10:02:00Z"}
{"index":{}}
{"host.ip":"192.168.2.50","timestamp":"2025-02-09T10:03:00Z"}
{"index":{}}
{"host.ip":"10.0.1.20","timestamp":"2025-02-09T10:04:00Z"}
{"index":{}}
{"host.ip":"203.0.113.42","timestamp":"2025-02-09T10:05:00Z"}
{"index":{}}
{"host.ip":"198.51.100.10","timestamp":"2025-02-09T10:06:00Z"}
{"index":{}}
{"host.ip":"192.168.0.1","timestamp":"2025-02-09T10:07:00Z"}
{"index":{}}
{"host.ip":"10.10.10.10","timestamp":"2025-02-09T10:08:00Z"}
{"index":{}}
{"host.ip":"172.31.255.1","timestamp":"2025-02-09T10:09:00Z"}
```

</details>

3. Try a normal terms aggregation against these two indices. You should see the aggregation fail and an error in the Elasticsearch logs:

```
GET hosts-ip,hosts-keyword/_search
{
  "size": 0,
  "aggs": {
    "host_ip": {
      "terms": {
        "field": "host.ip",
        "size": 10,
        "order": { "timestamp": "desc" }
      },
      "aggs": {
        "timestamp": { "max": { "field": "timestamp" } }
      }
    }
  }
}
```

```
java.lang.IllegalArgumentException: Failed trying to format bytes as IP address. Possibly caused by a mapping mismatch
```

4. Add `value_type` to the terms aggregation. You should see the aggregation response report a shard failure (an illegal argument exception), but the aggregation should be performed over the correctly `ip`-mapped data:

```
GET hosts-ip,hosts-keyword/_search
{
  "size": 0,
  "aggs": {
    "host_ip": {
      "terms": {
        "field": "host.ip",
        "value_type": "ip",
        "size": 10,
        "order": { "timestamp": "desc" }
      },
      "aggs": {
        "timestamp": { "max": { "field": "timestamp" } }
      }
    }
  }
}
```

(cherry picked from commit 586fcba)
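For reference, the step 4 aggregation above can also be built programmatically. This is a minimal TypeScript sketch; the `buildHostIpTermsAgg` helper name is illustrative and not the actual Kibana code:

```typescript
// Illustrative builder for the terms aggregation shown in step 4 above.
// `value_type: 'ip'` tells Elasticsearch which value type to use when the
// field mapping conflicts across indices, replacing the painless script.
function buildHostIpTermsAgg(size: number = 10) {
  return {
    host_ip: {
      terms: {
        field: 'host.ip',
        value_type: 'ip',
        size,
        // Order buckets by the most recent timestamp sub-aggregation.
        order: { timestamp: 'desc' },
      },
      aggs: {
        timestamp: { max: { field: 'timestamp' } },
      },
    },
  };
}
```

The returned object can be dropped into the `aggs` section of a `_search` request body.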
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.

Questions? Please refer to the Backport tool documentation
[Entity Analytics] Update `host.ip` aggregation to remove painless script (#252426) (#254549)

# Backport

This will backport the following commits from `main` to `9.3`:
- [[Entity Analytics] Update `host.ip` aggregation to remove painless script (#252426)](#252426)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Ying Mao <ying.mao@elastic.co>
